Zero-Shot Object Detection

  • ZSL based on bounding box features

    • [1]: makes the detector background-aware by assigning background bounding boxes to background classes, either statically or via latent assignment over a large vocabulary
    • [3]: pairs a max-margin classification loss with a semantic clustering loss that keeps similar classes close in the embedding space (sketched below)
  • End-to-end zero-shot object detection

    • [2]: extends YOLO; concatenates three feature maps to predict the objectness confidence score (sketched below)
    • [4]: uses a polarity loss, a focal-loss variant, together with an external vocabulary to enhance the word vectors (sketched below)
    • [5]: outputs both classification scores and semantic embeddings and combines the two for region scoring (sketched below)
  • Feature generation

    • [6]: synthesizes visual features for unseen classes and uses them to train the detector's classifier (sketched below)

    • [7]: semantics-preserving graph propagation modules that enhance both category and region representations (sketched below)
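
The clustering idea in [3] can be made concrete with a toy objective. A minimal sketch, assuming region features already projected into word-vector space; the margin value, the squared-distance cluster term, and the weight `lam` are illustrative choices, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def max_margin_cluster_loss(proj, target, class_emb, margin=0.2, lam=0.5):
    # proj: (N, D) region features projected into word-vector space
    # target: (N,) ground-truth class indices
    # class_emb: (C, D) class word vectors
    proj_n = F.normalize(proj, dim=1)
    emb_n = F.normalize(class_emb, dim=1)
    scores = proj_n @ emb_n.t()                       # cosine scores, (N, C)
    true = scores.gather(1, target[:, None])          # score of the true class
    # max-margin term: every wrong class should trail the true class by `margin`
    mm = F.relu(margin + scores - true)
    mm = mm.scatter(1, target[:, None], 0.0)          # ignore the true class
    # clustering term: pull each projection toward its own class word vector
    cluster = (proj_n - emb_n[target]).pow(2).sum(dim=1)
    return mm.sum(dim=1).mean() + lam * cluster.mean()
```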
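
For [2], the multi-map objectness idea can be sketched as a small head over three feature maps taken from different network depths. Channel counts, the bilinear resizing, and the 1x1 prediction conv are assumptions for illustration, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ObjectnessHead(nn.Module):
    def __init__(self, c1, c2, c3):
        super().__init__()
        # 1x1 conv over the concatenated maps -> one confidence channel
        self.conv = nn.Conv2d(c1 + c2 + c3, 1, kernel_size=1)

    def forward(self, f1, f2, f3):
        h, w = f1.shape[-2:]
        # resize the other maps to f1's spatial size, then concatenate
        f2 = F.interpolate(f2, size=(h, w), mode="bilinear", align_corners=False)
        f3 = F.interpolate(f3, size=(h, w), mode="bilinear", align_corners=False)
        x = torch.cat([f1, f2, f3], dim=1)
        return torch.sigmoid(self.conv(x))   # (N, 1, H, W) objectness map
```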
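
The polarity loss of [4] modulates a focal-style term so the penalty grows when a negative class scores close to or above the true class. A hedged sketch; the sigmoid penalty with slope `beta` and the handling of the positive entry are illustrative assumptions, not the paper's exact function:

```python
import torch
import torch.nn.functional as F

def polarity_loss(logits, target, gamma=2.0, beta=5.0):
    # logits: (N, C) raw class scores; target: (N,) ground-truth indices
    p = torch.sigmoid(logits)
    onehot = F.one_hot(target, logits.size(1)).float()
    # standard focal-loss modulation
    pt = p * onehot + (1 - p) * (1 - onehot)
    focal = -((1 - pt) ** gamma) * torch.log(pt.clamp_min(1e-8))
    # polarity penalty: compare every class prob to the true class's prob;
    # the penalty is high when a negative class rivals the true class
    p_true = (p * onehot).sum(dim=1, keepdim=True)    # (N, 1)
    penalty = torch.sigmoid(beta * (p - p_true))
    penalty = penalty * (1 - onehot) + onehot         # no penalty on the true class
    return (focal * penalty).sum(dim=1).mean()
```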
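
For [5], a minimal two-branch region head: one branch scores seen classes directly, the other maps the region feature into word-vector space so any class with a word vector (including unseen ones) can be scored by similarity. The blending weight `alpha` and layer shapes are assumptions for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridRegionHead(nn.Module):
    def __init__(self, feat_dim, embed_dim, class_embeddings):
        super().__init__()
        # class_embeddings: (C, D) class word vectors
        self.register_buffer("class_emb", F.normalize(class_embeddings, dim=1))
        self.cls_branch = nn.Linear(feat_dim, class_embeddings.size(0))  # direct scores
        self.emb_branch = nn.Linear(feat_dim, embed_dim)                 # semantic embedding

    def forward(self, region_feats, alpha=0.5):
        direct = self.cls_branch(region_feats)                   # (N, C)
        emb = F.normalize(self.emb_branch(region_feats), dim=1)  # (N, D)
        semantic = emb @ self.class_emb.t()                      # cosine scores, (N, C)
        return alpha * direct + (1 - alpha) * semantic           # hybrid score
```

At test time the semantic branch alone can score unseen classes by swapping in their word vectors.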
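
The feature-generation route of [6] can be sketched as a conditional generator that maps a class word vector plus noise to a fake region feature; synthesized unseen-class features then supervise an ordinary classifier. The MLP shape and dimensions are illustrative, and the paper's adversarial and regularization losses are omitted:

```python
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    def __init__(self, embed_dim, noise_dim, feat_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + noise_dim, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim),
            nn.ReLU(),  # region features are typically post-ReLU activations
        )

    def forward(self, class_emb, noise):
        return self.net(torch.cat([class_emb, noise], dim=1))

# Usage: synthesize features for unseen classes, then fit a classifier on them.
gen = FeatureGenerator(embed_dim=300, noise_dim=300, feat_dim=2048)
unseen_emb = torch.randn(8, 300)      # stand-in for unseen-class word vectors
noise = torch.randn(8, 300)
fake_feats = gen(unseen_emb, noise)   # (8, 2048) synthesized region features
```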
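
Finally, the propagation in [7] can be approximated by one GCN-style update over a class graph built from word-vector similarity, so category (and, analogously, region) representations absorb information from related classes. The thresholded adjacency, row normalization, and single-layer update are simplifications of the paper's semantics-preserving modules:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def propagate(class_emb, weight):
    # class_emb: (C, D) class word vectors; weight: learnable (D, D) projection
    emb_n = F.normalize(class_emb, dim=1)
    adj = F.relu(emb_n @ emb_n.t())              # keep positively related classes
    adj = adj / adj.sum(dim=1, keepdim=True)     # row-normalize (diagonal keeps mass)
    return F.relu(adj @ class_emb @ weight)      # one propagation step

C, D = 20, 300
emb = torch.randn(C, D)
W = nn.Parameter(torch.randn(D, D) * 0.01)
enhanced = propagate(emb, W)                     # (C, D) enhanced category reps
```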

References

[1] Ankan Bansal, Karan Sikka, Gaurav Sharma, Rama Chellappa, and Ajay Divakaran, “Zero-Shot Object Detection”, ECCV, 2018.

[2] Pengkai Zhu, Hanxiao Wang, and Venkatesh Saligrama, “Zero Shot Detection”, IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2019.

[3] Shafin Rahman, Salman Khan, and Fatih Porikli, “Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts”, arXiv:1803.06049, 2018.

[4] Shafin Rahman, Salman Khan, and Nick Barnes, “Polarity Loss for Zero-Shot Object Detection”, arXiv:1811.08982, 2018.

[5] Berkan Demirel, Ramazan Gokberk Cinbis, and Nazli Ikizler-Cinbis, “Zero-Shot Object Detection by Hybrid Region Embedding”, arXiv:1805.06157, 2018.

[6] Nasir Hayat et al., “Synthesizing the Unseen for Zero-Shot Object Detection”, ACCV, 2020.

[7] Caixia Yan et al., “Semantics-Preserving Graph Propagation for Zero-Shot Object Detection”, IEEE Transactions on Image Processing, vol. 29, pp. 8163-8176, 2020.